Summarizing Linked Data RDF Graphs Using Approximate Graph Pattern Mining

Authors

  • Mussab Zneika
  • Claudio Lucchese
  • Dan Vodislav
  • Dimitris Kotzinos
Abstract

The Linked Open Data (LOD) cloud brings together information described in RDF and stored on the web in (possibly distributed) RDF Knowledge Bases (KBs). The data in these KBs are not necessarily described by a known schema, and it is often extremely time-consuming to query all the interlinked KBs to acquire the necessary information. To tackle this problem, we propose a method for summarizing large RDF KBs using approximate RDF graph patterns and calculating the number of instances covered by each pattern. We then transform the patterns into an RDF schema that describes the contents of the KB. This lets us query the RDF graph summary to identify whether the necessary information is present and, if so, its size, before deciding to include it in a federated query result.
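
To make the idea of a pattern-based summary concrete, here is a minimal Python sketch. It is not the authors' algorithm: it swaps approximate graph pattern mining for a much simpler exact grouping, in which every subject is assigned to the pattern formed by its set of types and its set of properties, and each pattern keeps a count of the instances it covers. The names `summarize` and `instances_with`, the `ex:` identifiers, and the toy triples are all hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict

RDF_TYPE = "rdf:type"  # abbreviated predicate name, assumed for this sketch


def summarize(triples):
    """Group subjects by their exact (types, properties) signature.

    Returns a Counter mapping each pattern to the number of subjects
    (instances) that match it exactly. This is an exact grouping, used
    here as a stand-in for approximate pattern mining.
    """
    types = defaultdict(set)
    props = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)
        else:
            props[s].add(p)
    subjects = set(types) | set(props)
    return Counter(
        (frozenset(types[s]), frozenset(props[s])) for s in subjects
    )


def instances_with(summary, prop):
    """Count instances covered by patterns that include property `prop`."""
    return sum(n for (_, ps), n in summary.items() if prop in ps)


# Toy data: three persons, two of whom also have an ex:knows edge.
triples = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:bob", "ex:knows", "ex:alice"),
    ("ex:carol", RDF_TYPE, "ex:Person"),
    ("ex:carol", "ex:name", "Carol"),
]

summary = summarize(triples)
```

With such a summary in hand, a call like `instances_with(summary, "ex:knows")` answers both questions raised in the abstract (is the information present, and how many instances carry it) without touching the full graph. An approximate method would additionally merge near-identical patterns so that the summary stays small on noisy data.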

Similar resources

GRAPHIUM: Visualizing Performance of Graph and RDF Engines on Linked Data

Graph size, density, and number of labels negatively affect the performance of all the engines. Graph summarization seems to be more affected by the graph density and the number of labels. Dense graph is more influenced by the size of the graphs. RDF-3X outperforms the rest of the engines in pattern matching and graph creation. DEX seems to overcome the rest of the engines when the graphs ar...

RDF2Vec: RDF Graph Embeddings for Data Mining

Linked Open Data has been recognized as a valuable source for background information in data mining. However, most data mining tools require features in propositional form, i.e., a vector of nominal or numerical features associated with an instance, while Linked Open Data sources are graphs by nature. In this paper, we present RDF2Vec, an approach that uses language modeling approaches for unsu...

RDF2Vec: RDF Graph Embeddings and Their Applications

Linked Open Data has been recognized as a valuable source for background information in many data mining and information retrieval tasks. However, most of the existing tools require features in propositional form, i.e., a vector of nominal or numerical features associated with an instance, while Linked Open Data sources are graphs by nature. In this paper, we present RDF2Vec, an approach that u...

Exploring Linked Data Graph Structures

The true value of Linked Data becomes apparent when datasets are analyzed and understood already at the basic level of data types, constraints, value patterns, etc. Such data profiling is especially challenging for RDF data, the underlying data model on the Web of Data. In particular, graph analysis can be used to gain more insight into the data, induce schemas, or build indices. We present ProL...

Semantic Web Mining using RDF Data

Information on the web is increasing every minute, and redundancy in that information is growing rapidly. Data mining is the technique used to extract this data according to the user's query. Technically, data mining means analyzing data and summarizing it into useful information. Keyword search is an important tool for exploring and searching large data corpora whose structure is either unknown or constantly changing....

Publication date: 2016